61 research outputs found
Active Inverse Reward Design
Designers of AI agents often iterate on the reward function in a
trial-and-error process until they get the desired behavior, but this only
guarantees good behavior in the training environment. We propose structuring
this process as a series of queries asking the user to compare between
different reward functions. Thus we can actively select queries for maximum
informativeness about the true reward. In contrast to approaches asking the
designer for optimal behavior, this allows us to gather additional information
by eliciting preferences between suboptimal behaviors. After each query, we
need to update the posterior over the true reward function from observing the
proxy reward function chosen by the designer. The recently proposed Inverse
Reward Design (IRD) enables this. Our approach substantially outperforms IRD in
test environments. In particular, it can query the designer about
interpretable, linear reward functions and still infer non-linear ones.
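The query-selection loop described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a small discrete set of candidate true rewards and a hypothetical table `answer_likelihoods[a][i]` giving the probability that the designer picks answer `a` when candidate `i` is the true reward. The function names are illustrative only. The sketch applies Bayes' rule after an answer and scores each query by its expected reduction in posterior entropy.

```python
import math


def posterior_update(prior, likelihoods):
    """Bayes' rule over a discrete set of candidate true rewards.

    prior[i] is P(true reward = i); likelihoods[i] is the (assumed)
    probability of the observed designer answer given candidate i.
    """
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observed answer has zero probability under the prior")
    return [u / total for u in unnormalized]


def entropy(dist):
    """Shannon entropy in nats, skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def expected_info_gain(prior, answer_likelihoods):
    """Expected entropy reduction from asking one query.

    answer_likelihoods[a][i] = P(designer gives answer a | true reward i).
    """
    gain = entropy(prior)
    for lik in answer_likelihoods:
        p_answer = sum(p * l for p, l in zip(prior, lik))
        if p_answer > 0:
            gain -= p_answer * entropy(posterior_update(prior, lik))
    return gain


def select_query(prior, queries):
    """Return the index of the query with the largest expected information gain."""
    return max(range(len(queries)), key=lambda q: expected_info_gain(prior, queries[q]))


if __name__ == "__main__":
    prior = [0.5, 0.5]
    # Hypothetical queries: the first cannot distinguish the candidates, the second can.
    uninformative = [[0.5, 0.5], [0.5, 0.5]]
    informative = [[1.0, 0.0], [0.0, 1.0]]
    print(select_query(prior, [uninformative, informative]))
```

With a uniform prior over two candidates, a query whose answers are equally likely under both candidates has zero expected gain, so the selection falls on the query that separates them.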
Reducing Exploitability with Population Based Training
Self-play reinforcement learning has achieved state-of-the-art, and often
superhuman, performance in a variety of zero-sum games. Yet prior work has
found that policies that are highly capable against regular opponents can fail
catastrophically against adversarial policies: an opponent trained explicitly
against the victim. Prior defenses using adversarial training were able to make
the victim robust to a specific adversary, but the victim remained vulnerable
to new ones. We conjecture this limitation was due to insufficient diversity of
adversaries seen during training. We propose a defense using population based
training to pit the victim against a diverse set of opponents. We evaluate this
defense's robustness against new adversaries in two low-dimensional
environments. Our defense increases robustness against adversaries, as measured
by number of attacker training timesteps to exploit the victim. Furthermore, we
show that robustness is correlated with the size of the opponent population.
Comment: Presented at New Frontiers in Adversarial Machine Learning Workshop,
ICML 202
imitation: Clean Imitation Learning Implementations
imitation provides open-source implementations of imitation and reward
learning algorithms in PyTorch. We include three inverse reinforcement learning
(IRL) algorithms, three imitation learning algorithms and a preference
comparison algorithm. The implementations have been benchmarked against
previous results, and automated tests cover 98% of the code. Moreover, the
algorithms are implemented in a modular fashion, making it simple to develop
novel algorithms in the framework. Our source code, including documentation and
examples, is available at https://github.com/HumanCompatibleAI/imitatio
Adversarial Policies Beat Superhuman Go AIs
We attack the state-of-the-art Go-playing AI system KataGo by training
adversarial policies against it, achieving a >97% win rate against KataGo
running at superhuman settings. Our adversaries do not win by playing Go well.
Instead, they trick KataGo into making serious blunders. Our attack transfers
zero-shot to other superhuman Go-playing AIs, and is comprehensible to the
extent that human experts can implement it without algorithmic assistance to
consistently beat superhuman AIs. The core vulnerability uncovered by our
attack persists even in KataGo agents adversarially trained to defend against
our attack. Our results demonstrate that even superhuman AI systems may harbor
surprising failure modes. Example games are available at https://goattack.far.ai/.
Comment: Accepted to ICML 2023, see paper for changelo
Surgical Treatment of Renal Cell Cancer Liver Metastases: A Population-Based Study
Background: To evaluate outcomes of surgical treatment in patients with hepatic metastases from renal-cell carcinoma in the Netherlands, and to identify prognostic factors for survival after resection. Renal-cell carcinoma has an incidence of 2,000 new patients in the Netherlands each year (12.5/100,000 inhabitants). According to literature, half of these patients ultimately develop distant metastases with 20% involvement of the liver. Resection of renal-cell carcinoma liver metastases (RCCLM) is performed in only a minority of patients. Hence, little is known about outcome of resectable RCCLM. Methods: Patients were retrieved from local databases of the Netherlands Task Force for Liver Surgery (14 centers) and from the Dutch collective pathology database. Survival and prognostic factors were determined by Kaplan-Meier analysis and log rank test. Results: Thirty-three patients were identified who underwent resection (n = 29) or local ablation (n = 4) of RCCLM in the Netherlands between 1990 and 2008. These patients comprise 0.5% to 1% of the total population of patients diagnosed with RCCLM in that period. There was no operative mortality. The overall survival at 1, 3, and 5 years was 79, 47, and 43%, respectively. Metachronous metastases (n = 23, P = 0.03) and radical resection (n = 19, P < 0.001) were statistically significant prognosticators of ov
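The Kaplan-Meier analysis named in the Methods can be sketched as follows. This is a generic product-limit estimator, not the study's analysis code, and the follow-up times in the usage example are invented for illustration; they are not data from the study. The function name `kaplan_meier` is likewise an assumption.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time per patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up in years; the second patient is censored.
    for time, prob in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
        print(time, prob)
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the step at the third time point is larger than the first.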
Transcriptional profiling identifies an androgen receptor activity-low, stemness program associated with enzalutamide resistance.
The androgen receptor (AR) antagonist enzalutamide is one of the principal treatments for men with castration-resistant prostate cancer (CRPC). However, not all patients respond, and resistance mechanisms are largely unknown. We hypothesized that genomic and transcriptional features from metastatic CRPC biopsies prior to treatment would be predictive of de novo treatment resistance. To this end, we conducted a phase II trial of enzalutamide treatment (160 mg/d) in 36 men with metastatic CRPC. Thirty-four patients were evaluable for the primary end point of a prostate-specific antigen (PSA)50 response (PSA decline ≥50% at 12 wk vs. baseline). Nine patients were classified as nonresponders (PSA decline <50%), and 25 patients were classified as responders (PSA decline ≥50%). Failure to achieve a PSA50 was associated with shorter progression-free survival, time on treatment, and overall survival, demonstrating PSA50's utility. Targeted DNA-sequencing was performed on 26 of 36 biopsies, and RNA-sequencing was performed on 25 of 36 biopsies that contained sufficient material. Using computational methods, we measured AR transcriptional function and performed gene set enrichment analysis (GSEA) to identify pathways whose activity state correlated with de novo resistance. TP53 gene alterations were more common in nonresponders, although this did not reach statistical significance (P = 0.055). AR gene alterations and AR expression were similar between groups. Importantly, however, transcriptional measurements demonstrated that specific gene sets, including those linked to low AR transcriptional activity and a stemness program, were activated in nonresponders. Our results suggest that patients whose tumors harbor this program should be considered for clinical trials testing rational agents to overcome de novo enzalutamide resistance.
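The PSA50 end point defined in this abstract (a PSA decline of at least 50% at 12 weeks relative to baseline) reduces to a simple calculation. The sketch below is an illustration of that definition only; the function name and the PSA values in the usage example are hypothetical and are not taken from the trial.

```python
def psa50_response(baseline_psa, week12_psa):
    """Return True if PSA declined by at least 50% from baseline at 12 weeks."""
    if baseline_psa <= 0:
        raise ValueError("baseline PSA must be positive")
    decline = (baseline_psa - week12_psa) / baseline_psa
    return decline >= 0.5


if __name__ == "__main__":
    # Hypothetical PSA values in ng/mL.
    print(psa50_response(20.0, 8.0))   # 60% decline: responder
    print(psa50_response(20.0, 15.0))  # 25% decline: nonresponder
```

A rising PSA gives a negative decline and is therefore classified as a nonresponse.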